Add Example Application: Llama 3.2 (1 billion parameters) #8
99e576f to 62bf26b
ad36747 to 9da3796
📊 Test Results for Test Example Applications 3e0443f (2025_11_13_19_59_33) IRONCLAD
Tested on
📈 Trends (vs main branch) for Test Example Applications 3e0443f (2025_11_13_19_59_33) IRONCLAD Trends
llama_3.2_1b
📊 Test Results for Small Benchmark/Test Suite 3e0443f (2025_11_13_21_22_28) IRONCLAD
Tested on
📈 Trends (vs main branch) for Small Benchmark/Test Suite 3e0443f (2025_11_13_21_22_28) IRONCLAD Trends
axpy_1_cols_2_channels_2048_tile_2048_3.0
axpy_2_cols_2_channels_2048_tile_1024_3.0
axpy_4_cols_2_channels_2048_tile_512_3.0
axpy_8_cols_2_channels_2048_tile_256_3.0
dequant_1_cols_1_channels_2048_tile_2048
dequant_1_cols_2_channels_2048_tile_1024
dequant_2_cols_1_channels_2048_tile_1024
dequant_2_cols_2_channels_2048_tile_512
dequant_4_cols_1_channels_2048_tile_512
dequant_4_cols_2_channels_2048_tile_256
dequant_8_cols_1_channels_2048_tile_256
dequant_8_cols_2_channels_2048_tile_128
eltwise_add_1_cols_2_channels_2048_tile_2048
eltwise_add_2_cols_2_channels_2048_tile_1024
eltwise_add_4_cols_2_channels_2048_tile_512
eltwise_add_8_cols_2_channels_2048_tile_256
eltwise_mul_1_cols_2_channels_2048_tile_2048
eltwise_mul_2_cols_2_channels_2048_tile_1024
eltwise_mul_4_cols_2_channels_2048_tile_512
eltwise_mul_8_cols_2_channels_2048_tile_256
gelu_1_cols_1_channels_2048_tile_2048
gelu_1_cols_2_channels_2048_tile_1024
gelu_2_cols_1_channels_2048_tile_1024
gelu_2_cols_2_channels_2048_tile_512
gelu_4_cols_1_channels_2048_tile_512
gelu_4_cols_2_channels_2048_tile_256
gelu_8_cols_1_channels_2048_tile_256
gelu_8_cols_2_channels_2048_tile_128
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0
layer_norm_1_cols_1_channels_2048_tile_2048
layer_norm_1_cols_2_channels_2048_tile_1024
layer_norm_2_cols_1_channels_2048_tile_1024
layer_norm_2_cols_2_channels_2048_tile_512
layer_norm_4_cols_1_channels_2048_tile_512
layer_norm_4_cols_2_channels_2048_tile_256
layer_norm_8_cols_1_channels_2048_tile_256
layer_norm_8_cols_2_channels_2048_tile_128
matrix_vector_mul_128x128_32_1col
matrix_vector_mul_2048x8192_1_1col
matrix_vector_mul_2048x8192_1_2col
matrix_vector_mul_2048x8192_1_4col
matrix_vector_mul_2048x8192_1_8col
matrix_vector_mul_8192x2048_4_1col
matrix_vector_mul_8192x2048_4_2col
matrix_vector_mul_8192x2048_4_4col
matrix_vector_mul_8192x2048_4_8col
mem_copy_16_cores_2_chans_2048_tile_128_False
mem_copy_1_cols_1_channels_2048_tile_2048
mem_copy_1_cols_2_channels_2048_tile_1024
mem_copy_1_cores_1_chans_2048_tile_2048_False
mem_copy_2_cols_1_channels_2048_tile_1024
mem_copy_2_cols_2_channels_2048_tile_512
mem_copy_2_cores_1_chans_2048_tile_1024_False
mem_copy_2_cores_2_chans_2048_tile_1024_False
mem_copy_4_cols_1_channels_2048_tile_512
mem_copy_4_cols_2_channels_2048_tile_256
mem_copy_4_cores_1_chans_2048_tile_512_False
mem_copy_4_cores_2_chans_2048_tile_512_False
mem_copy_8_cols_1_channels_2048_tile_256
mem_copy_8_cols_2_channels_2048_tile_128
mem_copy_8_cores_1_chans_2048_tile_256_False
mem_copy_8_cores_2_chans_2048_tile_256_False
mha
relu_1_cols_1_channels_2048_tile_2048
relu_2_cols_1_channels_2048_tile_1024
relu_4_cols_1_channels_2048_tile_512
relu_8_cols_1_channels_2048_tile_256
rms_norm_1_cols_1_channels_2048_tile_2048
rms_norm_1_cols_2_channels_2048_tile_1024
rms_norm_2_cols_1_channels_2048_tile_1024
rms_norm_2_cols_2_channels_2048_tile_512
rms_norm_4_cols_1_channels_2048_tile_512
rms_norm_4_cols_2_channels_2048_tile_256
rms_norm_8_cols_1_channels_2048_tile_256
rms_norm_8_cols_2_channels_2048_tile_128
rope_1_cols_2_channels_4096_tile_4096_0
rope_2_cols_2_channels_4096_tile_2048_0
rope_4_cols_2_channels_4096_tile_1024_0
rope_8_cols_2_channels_4096_tile_512_0
silu_1_cols_1_channels_2048_tile_2048
silu_2_cols_1_channels_2048_tile_1024
silu_4_cols_1_channels_2048_tile_512
silu_8_cols_1_channels_2048_tile_256
softmax_1_cols_2_channels_4096_tile_2048
softmax_2_cols_2_channels_4096_tile_1024
softmax_2_cols_2_channels_4096_tile_512
swiglu
No metrics available.
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s
weighted_rms_norm_1_cols_2_channels_2048_weights_2048
weighted_rms_norm_2_cols_2_channels_2048_weights_1024
weighted_rms_norm_4_cols_2_channels_2048_weights_512
weighted_rms_norm_8_cols_2_channels_2048_weights_256
Just a suggestion:
Co-authored-by: pvasireddy-amd <pvasired@amd.com>
Co-authored-by: Victor Jung <33875047+Victor-Jung@users.noreply.github.com>
Co-authored-by: cubansil <CurtJohn.Bansil@amd.com>
Co-authored-by: André Rösti <an.roesti@gmail.com>

Co-authored-by: cubansil <CurtJohn.Bansil@amd.com>
Co-authored-by: Victor Jung <victor.jung@amd.com>
Co-authored-by: pvasireddy-amd <pvasired@amd.com>
Co-authored-by: André Rösti <an.roesti@gmail.com>
9da3796 to 28058d1
📊 Test Results for Test Example Applications bef0e25 (2025_11_14_21_10_56) IRONCLAD
Tested on
📈 Trends (vs main branch) for Test Example Applications bef0e25 (2025_11_14_21_10_56) IRONCLAD Trends
llama_3.2_1b
📊 Test Results for Test Example Applications 8c53610 (2025_11_14_22_53_33) IRONCLAD
Tested on
📈 Trends (vs main branch) for Test Example Applications 8c53610 (2025_11_14_22_53_33) IRONCLAD Trends
llama_3.2_1b
This reverts commit 345fcb4.
786a684 to d442395
📊 Test Results for Test Example Applications 82970e5 (2025_11_14_23_20_05) IRONCLAD
Tested on
📈 Trends (vs main branch) for Test Example Applications 82970e5 (2025_11_14_23_20_05) IRONCLAD Trends
llama_3.2_1b
📊 Test Results for Test Example Applications 8404bd4 (2025_11_14_23_48_04) IRONCLAD
Tested on
📈 Trends (vs main branch) for Test Example Applications 8404bd4 (2025_11_14_23_48_04) IRONCLAD Trends
llama_3.2_1b
📊 Test Results for Small Benchmark/Test Suite 8404bd4 (2025_11_15_00_04_44) IRONCLAD
Tested on
📈 Trends (vs main branch) for Small Benchmark/Test Suite 8404bd4 (2025_11_15_00_04_44) IRONCLAD Trends
axpy_1_cols_2_channels_2048_tile_2048_3.0
axpy_2_cols_2_channels_2048_tile_1024_3.0
axpy_4_cols_2_channels_2048_tile_512_3.0
axpy_8_cols_2_channels_2048_tile_256_3.0
dequant_1_cols_1_channels_2048_tile_2048
dequant_1_cols_2_channels_2048_tile_1024
dequant_2_cols_1_channels_2048_tile_1024
dequant_2_cols_2_channels_2048_tile_512
dequant_4_cols_1_channels_2048_tile_512
dequant_4_cols_2_channels_2048_tile_256
dequant_8_cols_1_channels_2048_tile_256
dequant_8_cols_2_channels_2048_tile_128
eltwise_add_1_cols_2_channels_2048_tile_2048
eltwise_add_2_cols_2_channels_2048_tile_1024
eltwise_add_4_cols_2_channels_2048_tile_512
eltwise_add_8_cols_2_channels_2048_tile_256
eltwise_mul_1_cols_2_channels_2048_tile_2048
eltwise_mul_2_cols_2_channels_2048_tile_1024
eltwise_mul_4_cols_2_channels_2048_tile_512
eltwise_mul_8_cols_2_channels_2048_tile_256
gelu_1_cols_1_channels_2048_tile_2048
gelu_1_cols_2_channels_2048_tile_1024
gelu_2_cols_1_channels_2048_tile_1024
gelu_2_cols_2_channels_2048_tile_512
gelu_4_cols_1_channels_2048_tile_512
gelu_4_cols_2_channels_2048_tile_256
gelu_8_cols_1_channels_2048_tile_256
gelu_8_cols_2_channels_2048_tile_128
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0
layer_norm_1_cols_1_channels_2048_tile_2048
layer_norm_1_cols_2_channels_2048_tile_1024
layer_norm_2_cols_1_channels_2048_tile_1024
layer_norm_2_cols_2_channels_2048_tile_512
layer_norm_4_cols_1_channels_2048_tile_512
layer_norm_4_cols_2_channels_2048_tile_256
layer_norm_8_cols_1_channels_2048_tile_256
layer_norm_8_cols_2_channels_2048_tile_128
matrix_vector_mul_128x128_32_1col
matrix_vector_mul_2048x8192_1_1col
matrix_vector_mul_2048x8192_1_2col
matrix_vector_mul_2048x8192_1_4col
matrix_vector_mul_2048x8192_1_8col
matrix_vector_mul_8192x2048_4_1col
matrix_vector_mul_8192x2048_4_2col
matrix_vector_mul_8192x2048_4_4col
matrix_vector_mul_8192x2048_4_8col
mem_copy_16_cores_2_chans_2048_tile_128_False
mem_copy_1_cols_1_channels_2048_tile_2048
mem_copy_1_cols_2_channels_2048_tile_1024
mem_copy_1_cores_1_chans_2048_tile_2048_False
mem_copy_2_cols_1_channels_2048_tile_1024
mem_copy_2_cols_2_channels_2048_tile_512
mem_copy_2_cores_1_chans_2048_tile_1024_False
mem_copy_2_cores_2_chans_2048_tile_1024_False
mem_copy_4_cols_1_channels_2048_tile_512
mem_copy_4_cols_2_channels_2048_tile_256
mem_copy_4_cores_1_chans_2048_tile_512_False
mem_copy_4_cores_2_chans_2048_tile_512_False
mem_copy_8_cols_1_channels_2048_tile_256
mem_copy_8_cols_2_channels_2048_tile_128
mem_copy_8_cores_1_chans_2048_tile_256_False
mem_copy_8_cores_2_chans_2048_tile_256_False
mha
relu_1_cols_1_channels_2048_tile_2048
relu_2_cols_1_channels_2048_tile_1024
relu_4_cols_1_channels_2048_tile_512
relu_8_cols_1_channels_2048_tile_256
rms_norm_1_cols_1_channels_2048_tile_2048
rms_norm_1_cols_2_channels_2048_tile_1024
rms_norm_2_cols_1_channels_2048_tile_1024
rms_norm_2_cols_2_channels_2048_tile_512
rms_norm_4_cols_1_channels_2048_tile_512
rms_norm_4_cols_2_channels_2048_tile_256
rms_norm_8_cols_1_channels_2048_tile_256
rms_norm_8_cols_2_channels_2048_tile_128
rope_1_cols_2_channels_4096_tile_4096_0
rope_2_cols_2_channels_4096_tile_2048_0
rope_4_cols_2_channels_4096_tile_1024_0
rope_8_cols_2_channels_4096_tile_512_0
silu_1_cols_1_channels_2048_tile_2048
silu_2_cols_1_channels_2048_tile_1024
silu_4_cols_1_channels_2048_tile_512
silu_8_cols_1_channels_2048_tile_256
softmax_1_cols_2_channels_4096_tile_2048
softmax_2_cols_2_channels_4096_tile_1024
softmax_2_cols_2_channels_4096_tile_512
swiglu
No metrics available.
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s
weighted_rms_norm_1_cols_2_channels_2048_weights_2048
weighted_rms_norm_2_cols_2_channels_2048_weights_1024
weighted_rms_norm_4_cols_2_channels_2048_weights_512
weighted_rms_norm_8_cols_2_channels_2048_weights_256
Added
Changed
example -> operators
examples directory to hold llama
PR Merge Checklist
devel commit and pointing to devel.